The geodata is provided by © OpenStreetMap contributors and is made available here under the Open Database License (ODbL).
*** current_df ***
Country name Regional indicator Ladder score Standard error of ladder score upperwhisker lowerwhisker Logged GDP per capita Social support Healthy life expectancy Freedom to make life choices Generosity Perceptions of corruption Ladder score in Dystopia Explained by: Log GDP per capita Explained by: Social support Explained by: Healthy life expectancy Explained by: Freedom to make life choices Explained by: Generosity Explained by: Perceptions of corruption Dystopia + residual
0 Finland Western Europe 7.842 0.032 7.904 7.780 10.775 0.954 72.0 0.949 -0.098 0.186 2.43 1.446 1.106 0.741 0.691 0.124 0.481 3.253
1 Denmark Western Europe 7.620 0.035 7.687 7.552 10.933 0.954 72.7 0.946 0.030 0.179 2.43 1.502 1.108 0.763 0.686 0.208 0.485 2.868
2 Switzerland Western Europe 7.571 0.036 7.643 7.500 11.117 0.942 74.4 0.919 0.025 0.292 2.43 1.566 1.079 0.816 0.653 0.204 0.413 2.839
3 Iceland Western Europe 7.554 0.059 7.670 7.438 10.878 0.983 73.0 0.955 0.160 0.673 2.43 1.482 1.172 0.772 0.698 0.293 0.170 2.967
4 Netherlands Western Europe 7.464 0.027 7.518 7.410 10.932 0.942 72.4 0.913 0.175 0.338 2.43 1.501 1.079 0.753 0.647 0.302 0.384 2.798
######################################################################################################################################################
Country name Regional indicator Ladder score Standard error of ladder score upperwhisker lowerwhisker Logged GDP per capita Social support Healthy life expectancy Freedom to make life choices Generosity Perceptions of corruption Ladder score in Dystopia Explained by: Log GDP per capita Explained by: Social support Explained by: Healthy life expectancy Explained by: Freedom to make life choices Explained by: Generosity Explained by: Perceptions of corruption Dystopia + residual
144 Lesotho Sub-Saharan Africa 3.512 0.120 3.748 3.276 7.926 0.787 48.700 0.715 -0.131 0.915 2.43 0.451 0.731 0.007 0.405 0.103 0.015 1.800
145 Botswana Sub-Saharan Africa 3.467 0.074 3.611 3.322 9.782 0.784 59.269 0.824 -0.246 0.801 2.43 1.099 0.724 0.340 0.539 0.027 0.088 0.648
146 Rwanda Sub-Saharan Africa 3.415 0.068 3.548 3.282 7.676 0.552 61.400 0.897 0.061 0.167 2.43 0.364 0.202 0.407 0.627 0.227 0.493 1.095
147 Zimbabwe Sub-Saharan Africa 3.145 0.058 3.259 3.030 7.943 0.750 56.201 0.677 -0.047 0.821 2.43 0.457 0.649 0.243 0.359 0.157 0.075 1.205
148 Afghanistan South Asia 2.523 0.038 2.596 2.449 7.695 0.463 52.493 0.382 -0.102 0.924 2.43 0.370 0.000 0.126 0.000 0.122 0.010 1.895
######################################################################################################################################################
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 149 entries, 0 to 148
Data columns (total 20 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Country name 149 non-null object
1 Regional indicator 149 non-null object
2 Ladder score 149 non-null float64
3 Standard error of ladder score 149 non-null float64
4 upperwhisker 149 non-null float64
5 lowerwhisker 149 non-null float64
6 Logged GDP per capita 149 non-null float64
7 Social support 149 non-null float64
8 Healthy life expectancy 149 non-null float64
9 Freedom to make life choices 149 non-null float64
10 Generosity 149 non-null float64
11 Perceptions of corruption 149 non-null float64
12 Ladder score in Dystopia 149 non-null float64
13 Explained by: Log GDP per capita 149 non-null float64
14 Explained by: Social support 149 non-null float64
15 Explained by: Healthy life expectancy 149 non-null float64
16 Explained by: Freedom to make life choices 149 non-null float64
17 Explained by: Generosity 149 non-null float64
18 Explained by: Perceptions of corruption 149 non-null float64
19 Dystopia + residual 149 non-null float64
dtypes: float64(18), object(2)
memory usage: 23.4+ KB
None
######################################################################################################################################################
*** historic_df ***
Country name year Life Ladder Log GDP per capita Social support Healthy life expectancy at birth Freedom to make life choices Generosity Perceptions of corruption Positive affect Negative affect
0 Afghanistan 2008 3.724 7.370 0.451 50.80 0.718 0.168 0.882 0.518 0.258
1 Afghanistan 2009 4.402 7.540 0.552 51.20 0.679 0.190 0.850 0.584 0.237
2 Afghanistan 2010 4.758 7.647 0.539 51.60 0.600 0.121 0.707 0.618 0.275
3 Afghanistan 2011 3.832 7.620 0.521 51.92 0.496 0.162 0.731 0.611 0.267
4 Afghanistan 2012 3.783 7.705 0.521 52.24 0.531 0.236 0.776 0.710 0.268
######################################################################################################################################################
Country name year Life Ladder Log GDP per capita Social support Healthy life expectancy at birth Freedom to make life choices Generosity Perceptions of corruption Positive affect Negative affect
1944 Zimbabwe 2016 3.735 7.984 0.768 54.4 0.733 -0.095 0.724 0.738 0.209
1945 Zimbabwe 2017 3.638 8.016 0.754 55.0 0.753 -0.098 0.751 0.806 0.224
1946 Zimbabwe 2018 3.616 8.049 0.775 55.6 0.763 -0.068 0.844 0.710 0.212
1947 Zimbabwe 2019 2.694 7.950 0.759 56.2 0.632 -0.064 0.831 0.716 0.235
1948 Zimbabwe 2020 3.160 7.829 0.717 56.8 0.643 -0.009 0.789 0.703 0.346
######################################################################################################################################################
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1949 entries, 0 to 1948
Data columns (total 11 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Country name 1949 non-null object
1 year 1949 non-null int64
2 Life Ladder 1949 non-null float64
3 Log GDP per capita 1913 non-null float64
4 Social support 1936 non-null float64
5 Healthy life expectancy at birth 1894 non-null float64
6 Freedom to make life choices 1917 non-null float64
7 Generosity 1860 non-null float64
8 Perceptions of corruption 1839 non-null float64
9 Positive affect 1927 non-null float64
10 Negative affect 1933 non-null float64
dtypes: float64(9), int64(1), object(1)
memory usage: 167.6+ KB
None
######################################################################################################################################################
*** current_df ***
Old column names:
['Country name' 'Regional indicator' 'Ladder score' 'Standard error of ladder score' 'upperwhisker' 'lowerwhisker' 'Logged GDP per capita' 'Social support' 'Healthy life expectancy' 'Freedom to make life choices' 'Generosity' 'Perceptions of corruption' 'Ladder score in Dystopia' 'Explained by: Log GDP per capita' 'Explained by: Social support' 'Explained by: Healthy life expectancy' 'Explained by: Freedom to make life choices' 'Explained by: Generosity' 'Explained by: Perceptions of corruption' 'Dystopia + residual']
New column names:
['Country_Name' 'Regional_Indicator' 'Ladder_Score' 'Standard_Error_Of_Ladder_Score' 'Upperwhisker' 'Lowerwhisker' 'Logged_Gdp_Per_Capita' 'Social_Support' 'Healthy_Life_Expectancy' 'Freedom_To_Make_Life_Choices' 'Generosity' 'Perceptions_Of_Corruption' 'Ladder_Score_In_Dystopia' 'Explained_By:_Log_Gdp_Per_Capita' 'Explained_By:_Social_Support' 'Explained_By:_Healthy_Life_Expectancy' 'Explained_By:_Freedom_To_Make_Life_Choices' 'Explained_By:_Generosity' 'Explained_By:_Perceptions_Of_Corruption' 'Dystopia_+_Residual']
######################################################################################################################################################
*** historic_df ***
Old column names:
['Country name' 'year' 'Life Ladder' 'Log GDP per capita' 'Social support' 'Healthy life expectancy at birth' 'Freedom to make life choices' 'Generosity' 'Perceptions of corruption' 'Positive affect' 'Negative affect']
New column names:
['Country_Name' 'Year' 'Life_Ladder' 'Log_Gdp_Per_Capita' 'Social_Support' 'Healthy_Life_Expectancy_At_Birth' 'Freedom_To_Make_Life_Choices' 'Generosity' 'Perceptions_Of_Corruption' 'Positive_Affect' 'Negative_Affect']
######################################################################################################################################################
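The old-to-new name pairs above are consistent with title-casing each column name and replacing spaces with underscores. The following is a minimal sketch of that rule, assuming this is how the names were produced; the helper name `normalize_column_name` is hypothetical and does not come from the original code.

```python
def normalize_column_name(name: str) -> str:
    """Title-case a column name and join its words with underscores.

    Assumption: this mirrors the renaming rule implied by the log above.
    """
    return name.title().replace(" ", "_")


# A few of the old names listed in the log.
old_names = [
    "Country name",
    "upperwhisker",
    "Logged GDP per capita",
    "Explained by: Log GDP per capita",
    "Dystopia + residual",
]

new_names = [normalize_column_name(n) for n in old_names]
print(new_names)
```

With pandas, the same function could be passed as `df.rename(columns=normalize_column_name)`, since `DataFrame.rename` accepts a callable for `columns`.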
False
False
######################################################################################################################################################
False
Country_Name Regional_Indicator Year Happiness_Index Logged_Gdp_Per_Capita Social_Support Healthy_Life_Expectancy Freedom_To_Make_Life_Choices Generosity Perceptions_Of_Corruption
0 Afghanistan South Asia 2008 3.724 7.370 0.451 50.800 0.718 0.168 0.882
1 Afghanistan South Asia 2009 4.402 7.540 0.552 51.200 0.679 0.190 0.850
2 Afghanistan South Asia 2010 4.758 7.647 0.539 51.600 0.600 0.121 0.707
3 Afghanistan South Asia 2011 3.832 7.620 0.521 51.920 0.496 0.162 0.731
4 Afghanistan South Asia 2012 3.783 7.705 0.521 52.240 0.531 0.236 0.776
... ... ... ... ... ... ... ... ... ... ...
2030 Zimbabwe Sub-Saharan Africa 2017 3.638 8.016 0.754 55.000 0.753 -0.098 0.751
2031 Zimbabwe Sub-Saharan Africa 2018 3.616 8.049 0.775 55.600 0.763 -0.068 0.844
2032 Zimbabwe Sub-Saharan Africa 2019 2.694 7.950 0.759 56.200 0.632 -0.064 0.831
2033 Zimbabwe Sub-Saharan Africa 2020 3.160 7.829 0.717 56.800 0.643 -0.009 0.789
2034 Zimbabwe Sub-Saharan Africa 2021 3.145 7.943 0.750 56.201 0.677 -0.047 0.821
[2035 rows x 10 columns]
Changed column "Country_Name`s" datatype to "category"
Changed column "Regional_Indicator`s" datatype to "category"
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2035 entries, 0 to 2034
Data columns (total 10 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Country_Name 2035 non-null category
1 Regional_Indicator 2035 non-null category
2 Year 2035 non-null int64
3 Happiness_Index 2035 non-null float64
4 Logged_Gdp_Per_Capita 2011 non-null float64
5 Social_Support 2026 non-null float64
6 Healthy_Life_Expectancy 1984 non-null float64
7 Freedom_To_Make_Life_Choices 2005 non-null float64
8 Generosity 1959 non-null float64
9 Perceptions_Of_Corruption 1931 non-null float64
dtypes: category(2), float64(7), int64(1)
memory usage: 138.9 KB
None
######################################################################################################################################################
Year Happiness_Index Logged_Gdp_Per_Capita Social_Support Healthy_Life_Expectancy Freedom_To_Make_Life_Choices Generosity Perceptions_Of_Corruption
count 2035.000000 2035.000000 2011.000000 2026.000000 1984.000000 2005.000000 1959.000000 1931.000000
mean 2013.826536 5.490948 9.391096 0.814959 63.695212 0.748269 -0.002346 0.746277
std 4.514250 1.107523 1.141129 0.116125 7.376080 0.139289 0.162257 0.186760
min 2005.000000 2.375000 6.635000 0.291000 32.300000 0.258000 -0.335000 0.035000
25% 2010.000000 4.669000 8.484000 0.751000 59.180000 0.656000 -0.117000 0.690000
50% 2014.000000 5.420000 9.487000 0.836000 65.400000 0.769000 -0.029000 0.801000
75% 2018.000000 6.298000 10.370500 0.906750 68.800000 0.861000 0.089000 0.870000
max 2021.000000 8.019000 11.648000 0.987000 77.100000 0.985000 0.698000 0.983000
Column "Perceptions_Of_Corruption"`s values are almost completely missing for country: "China"
Column "Healthy_Life_Expectancy"`s values are almost completely missing for country: "Hong Kong S.A.R. of China"
Column "Healthy_Life_Expectancy"`s values are almost completely missing for country: "Kosovo"
Column "Perceptions_Of_Corruption"`s values are almost completely missing for country: "Turkmenistan"
######################################################################################################################################################
Does our data frame contain any missing values? False
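The "almost completely missing" messages above suggest a per-country check on the share of missing values in a column. The sketch below shows one way such a check might look. The function name `mostly_missing`, the 0.8 threshold, and the toy rows are all assumptions made for illustration; the log does not state the actual cutoff or method.

```python
import math
from collections import defaultdict


def mostly_missing(rows, column, threshold=0.8):
    """Return countries whose share of missing `column` values is >= threshold.

    `threshold` is an assumed cutoff; the original value is not in the log.
    """
    total = defaultdict(int)
    missing = defaultdict(int)
    for row in rows:
        country = row["Country_Name"]
        value = row.get(column)
        total[country] += 1
        # Treat both None and NaN as missing.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            missing[country] += 1
    return sorted(c for c in total if missing[c] / total[c] >= threshold)


# Made-up rows for illustration only (not taken from the dataset).
rows = [
    {"Country_Name": "Country A", "Perceptions_Of_Corruption": None},
    {"Country_Name": "Country A", "Perceptions_Of_Corruption": float("nan")},
    {"Country_Name": "Country A", "Perceptions_Of_Corruption": None},
    {"Country_Name": "Country A", "Perceptions_Of_Corruption": None},
    {"Country_Name": "Country A", "Perceptions_Of_Corruption": 0.7},
    {"Country_Name": "Country B", "Perceptions_Of_Corruption": 0.5},
    {"Country_Name": "Country B", "Perceptions_Of_Corruption": 0.6},
]

flagged = mostly_missing(rows, "Perceptions_Of_Corruption")
print(flagged)
```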
Column "Regional_Indicator`s" unique values are:
['Central and Eastern Europe' 'Commonwealth of Independent States' 'East Asia' 'Latin America and Caribbean' 'Middle East and North Africa' 'North America and ANZ' 'South Asia' 'Southeast Asia' 'Sub-Saharan Africa' 'Western Europe']
Column "Year`s" unique values are:
[2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021]
Start training our models:
Model: linear
Model's CV average R2 score: 73.81%
Model's CV average MSE loss value: 0.3848
Model's test R2 score: 80.67%
Model: XGBoost
Model's CV average R2 score: 74.05%
Model's CV average MSE loss value: 0.3939
Model's test R2 score: 88.14%
Model: hist_boosting
Model's CV average R2 score: 75.63%
Model's CV average MSE loss value: 0.3819
Model's test R2 score: 87.94%
Model: random_forest
Model's CV average R2 score: 75.43%
Model's CV average MSE loss value: 0.3833
Model's test R2 score: 89.55%
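For reference, the two metrics reported above can be written out in a few lines. This is a sketch under the assumption that "R2 score" is the standard coefficient of determination (shown in the log as a percentage) and "MSE loss value" is the mean squared error; the function names and toy values are illustrative and are not the original training code.

```python
def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only (not taken from the dataset).
y_true = [3.0, 5.0, 7.0]
y_pred = [3.5, 5.0, 6.5]

toy_mse = mse(y_true, y_pred)
toy_r2 = r2_score(y_true, y_pred)
print(f"MSE: {toy_mse:.4f}  R2: {toy_r2:.2%}")
```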
100%|##########| 4/4 [00:07<00:00, 1.86s/it]